Overview

Dataset statistics

Number of variables: 33
Number of observations: 199522
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3172
Duplicate rows (%): 1.6%
Total size in memory: 50.2 MiB
Average record size in memory: 264.0 B
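
The percentage and per-record figures in this block follow from the raw counts. A minimal sketch that recomputes them from the numbers reported here; the variable names are illustrative, and because the memory total is itself rounded to one decimal, the per-record result is only approximate:

```python
# Figures copied from the "Dataset statistics" block.
n_rows = 199522
duplicate_rows = 3172
total_mib = 50.2  # already rounded in the report, so derived values are approximate

duplicate_pct = 100 * duplicate_rows / n_rows
bytes_per_record = total_mib * 1024 ** 2 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Approx. record size: {bytes_per_record:.0f} B")
```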

Variable types

Numeric: 10
Categorical: 22
Text: 1

Alerts

Dataset has 3172 (1.6%) duplicate rows [Duplicates]
edu_inst is highly imbalanced (74.6%) [Imbalance]
mace is highly imbalanced (62.2%) [Imbalance]
hispanic is highly imbalanced (71.7%) [Imbalance]
labor_union is highly imbalanced (67.5%) [Imbalance]
reason_unemployment is highly imbalanced (89.9%) [Imbalance]
migration_msa is highly imbalanced (55.1%) [Imbalance]
migration_reg is highly imbalanced (52.5%) [Imbalance]
migration_within is highly imbalanced (54.5%) [Imbalance]
citizen is highly imbalanced (70.8%) [Imbalance]
person_income is highly imbalanced (68.0%) [Imbalance]
own_bus is highly imbalanced (94.5%) [Imbalance]
income is highly imbalanced (66.4%) [Imbalance]
divdends is highly skewed (γ1 = 27.78643274) [Skewed]
age has 2839 (1.4%) zeros [Zeros]
industry_code has 100683 (50.5%) zeros [Zeros]
occupation_code has 100683 (50.5%) zeros [Zeros]
wage_per_hour has 188218 (94.3%) zeros [Zeros]
gains has 192143 (96.3%) zeros [Zeros]
losses has 195616 (98.0%) zeros [Zeros]
divdends has 178381 (89.4%) zeros [Zeros]
person_worked has 95982 (48.1%) zeros [Zeros]
week_workd has 95982 (48.1%) zeros [Zeros]
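
The imbalance percentages in these alerts are consistent with one minus the normalized Shannon entropy of the category frequencies. The report does not state this formula, so it is treated here as an assumption, checked against the category counts given in the edu_inst and labor_union sections. A minimal sketch (the function name `imbalance_score` is illustrative):

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, near 1 when one category dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # illustrative convention for degenerate input
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts as reported in the edu_inst and labor_union sections.
edu_inst = [186942, 6892, 5688]
labor_union = [180458, 16034, 3030]

print(f"edu_inst: {imbalance_score(edu_inst):.1%}")
print(f"labor_union: {imbalance_score(labor_union):.1%}")
```

Under this assumption both scores round to the values shown in the alerts (74.6% and 67.5%).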

Reproduction

Analysis started: 2024-05-18 10:35:55.169077
Analysis finished: 2024-05-18 10:36:05.279845
Duration: 10.11 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json
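
As a quick consistency check, the reported duration can be recomputed from the two timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-05-18 10:35:55.169077")
finished = datetime.fromisoformat("2024-05-18 10:36:05.279845")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```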

Variables

age
Real number (ℝ)

ZEROS 

Distinct: 91
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.494006
Minimum: 0
Maximum: 90
Zeros: 2839
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 15
median: 33
Q3: 50
95-th percentile: 75
Maximum: 90
Range: 90
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 22.310785
Coefficient of variation (CV): 0.64680179
Kurtosis: -0.73279952
Mean: 34.494006
Median Absolute Deviation (MAD): 17
Skewness: 0.37329807
Sum: 6882313
Variance: 497.77111
Monotonicity: Not monotonic
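
Several of the age statistics are derived from one another. A short sketch, using only values printed in this section, that shows the relationships (mean = sum / n, variance = std², CV = std / mean, IQR = Q3 − Q1); the variable names are illustrative, and small differences in the last digits come from the report's rounding:

```python
# Values copied from the age summary.
n_rows = 199522
total = 6882313            # Sum
std = 22.310785            # Standard deviation (rounded in the report)
q1, q3 = 15, 50
minimum, maximum = 0, 90

mean = total / n_rows
variance = std ** 2
cv = std / mean            # coefficient of variation
iqr = q3 - q1
value_range = maximum - minimum

print(f"mean={mean:.6f} variance={variance:.2f} cv={cv:.4f} "
      f"iqr={iqr} range={value_range}")
```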
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
34                  3489     1.7%
35                  3450     1.7%
36                  3353     1.7%
31                  3351     1.7%
33                  3340     1.7%
5                   3332     1.7%
4                   3318     1.7%
3                   3279     1.6%
37                  3278     1.6%
38                  3277     1.6%
Other values (81)   166055   83.2%

Value   Count   Frequency (%)
0       2839    1.4%
1       3138    1.6%
2       3236    1.6%
3       3279    1.6%
4       3318    1.7%
5       3332    1.7%
6       3171    1.6%
7       3218    1.6%
8       3187    1.6%
9       3162    1.6%

Value   Count   Frequency (%)
90      725     0.4%
89      195     0.1%
88      241     0.1%
87      301     0.2%
86      348     0.2%
85      423     0.2%
84      519     0.3%
83      561     0.3%
82      615     0.3%
81      720     0.4%

class_of_worker
Categorical

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB
Not in universe: 100244
Private: 72028
Self-employed-not incorporated: 8445
Local government: 7784
State government: 4227
Other values (4): 6794

Length

Max length: 31
Median length: 16
Mean length: 14.021146
Min length: 8

Characters and Unicode

Total characters: 2797527
Distinct characters: 29
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Self-employed-not incorporated
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Private

Common Values

Value                            Count    Frequency (%)
Not in universe                  100244   50.2%
Private                          72028    36.1%
Self-employed-not incorporated   8445     4.2%
Local government                 7784     3.9%
State government                 4227     2.1%
Self-employed-incorporated       3265     1.6%
Federal government               2925     1.5%
Never worked                     439      0.2%
Without pay                      165      0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
not 100244
23.6%
in 100244
23.6%
universe 100244
23.6%
private 72028
17.0%
government 14936
 
3.5%
self-employed-not 8445
 
2.0%
incorporated 8445
 
2.0%
local 7784
 
1.8%
state 4227
 
1.0%
self-employed-incorporated 3265
 
0.8%
Other values (5) 4133
 
1.0%

Most occurring characters

ValueCountFrequency (%)
423995
15.2%
e 360622
12.9%
i 284391
10.2%
n 250515
9.0%
t 216147
7.7%
r 214431
7.7%
v 187647
 
6.7%
o 167143
 
6.0%
N 100683
 
3.6%
u 100409
 
3.6%
Other values (19) 491544
17.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2150590
76.9%
Space Separator 423995
 
15.2%
Uppercase Letter 199522
 
7.1%
Dash Punctuation 23420
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 360622
16.8%
i 284391
13.2%
n 250515
11.6%
t 216147
10.1%
r 214431
10.0%
v 187647
8.7%
o 167143
7.8%
u 100409
 
4.7%
s 100244
 
4.7%
a 98839
 
4.6%
Other values (11) 170202
7.9%
Uppercase Letter
ValueCountFrequency (%)
N 100683
50.5%
P 72028
36.1%
S 15937
 
8.0%
L 7784
 
3.9%
F 2925
 
1.5%
W 165
 
0.1%
Space Separator
ValueCountFrequency (%)
423995
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 23420
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2350112
84.0%
Common 447415
 
16.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 360622
15.3%
i 284391
12.1%
n 250515
10.7%
t 216147
9.2%
r 214431
9.1%
v 187647
8.0%
o 167143
7.1%
N 100683
 
4.3%
u 100409
 
4.3%
s 100244
 
4.3%
Other values (17) 367880
15.7%
Common
ValueCountFrequency (%)
423995
94.8%
- 23420
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2797527
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
423995
15.2%
e 360622
12.9%
i 284391
10.2%
n 250515
9.0%
t 216147
7.7%
r 214431
7.7%
v 187647
 
6.7%
o 167143
 
6.0%
N 100683
 
3.6%
u 100409
 
3.6%
Other values (19) 491544
17.6%

industry_code
Real number (ℝ)

ZEROS 

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.352397
Minimum: 0
Maximum: 51
Zeros: 100683
Zeros (%): 50.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 33
95-th percentile: 44
Maximum: 51
Range: 51
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 18.067141
Coefficient of variation (CV): 1.1768287
Kurtosis: -1.501116
Mean: 15.352397
Median Absolute Deviation (MAD): 0
Skewness: 0.51667949
Sum: 3063141
Variance: 326.4216
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
0                   100683   50.5%
33                  17070    8.6%
43                  8283     4.2%
4                   5984     3.0%
42                  4683     2.3%
45                  4482     2.2%
29                  4209     2.1%
37                  4022     2.0%
41                  3964     2.0%
32                  3596     1.8%
Other values (42)   42546    21.3%

Value   Count    Frequency (%)
0       100683   50.5%
1       827      0.4%
2       2196     1.1%
3       563      0.3%
4       5984     3.0%
5       553      0.3%
6       554      0.3%
7       422      0.2%
8       550      0.3%
9       993      0.5%

Value   Count   Frequency (%)
51      36      < 0.1%
50      1704    0.9%
49      610     0.3%
48      652     0.3%
47      1644    0.8%
46      187     0.1%
45      4482    2.2%
44      2549    1.3%
43      8283    4.2%
42      4683    2.3%

occupation_code
Real number (ℝ)

ZEROS 

Distinct: 47
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.306613
Minimum: 0
Maximum: 46
Zeros: 100683
Zeros (%): 50.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 26
95-th percentile: 38
Maximum: 46
Range: 46
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 14.454218
Coefficient of variation (CV): 1.2783862
Kurtosis: -0.89654589
Mean: 11.306613
Median Absolute Deviation (MAD): 0
Skewness: 0.82923051
Sum: 2255918
Variance: 208.92442
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Value               Count    Frequency (%)
0                   100683   50.5%
2                   8756     4.4%
26                  7887     4.0%
19                  5413     2.7%
29                  5105     2.6%
36                  4145     2.1%
34                  4025     2.0%
10                  3683     1.8%
16                  3445     1.7%
23                  3392     1.7%
Other values (37)   52988    26.6%

Value   Count    Frequency (%)
0       100683   50.5%
1       544      0.3%
2       8756     4.4%
3       3195     1.6%
4       1364     0.7%
5       855      0.4%
6       441      0.2%
7       731      0.4%
8       2151     1.1%
9       738      0.4%

Value   Count   Frequency (%)
46      36      < 0.1%
45      172     0.1%
44      1592    0.8%
43      1382    0.7%
42      1918    1.0%
41      1592    0.8%
40      617     0.3%
39      1017    0.5%
38      3003    1.5%
37      2234    1.1%

education
Categorical

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB
High school graduate: 48406
Children: 47422
Some college but no degree: 27820
Bachelors degree(BA AB BS): 19865
7th and 8th grade: 8007
Other values (12): 48002

Length

Max length: 39
Median length: 35
Mean length: 19.86398
Min length: 9

Characters and Unicode

Total characters: 3963301
Distinct characters: 47
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Some college but no degree
2nd row 10th grade
3rd row Children
4th row Children
5th row Some college but no degree

Common Values

Value                                    Count   Frequency (%)
High school graduate                     48406   24.3%
Children                                 47422   23.8%
Some college but no degree               27820   13.9%
Bachelors degree(BA AB BS)               19865   10.0%
7th and 8th grade                        8007    4.0%
10th grade                               7557    3.8%
11th grade                               6876    3.4%
Masters degree(MA MS MEng MEd MSW MBA)   6541    3.3%
9th grade                                6230    3.1%
Associates degree-occup /vocational      5358    2.7%
Other values (7)                         15440   7.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
school 50199
 
8.2%
graduate 48406
 
7.9%
high 48406
 
7.9%
children 47422
 
7.7%
grade 36691
 
6.0%
no 29946
 
4.9%
degree 29613
 
4.8%
some 27820
 
4.5%
college 27820
 
4.5%
but 27820
 
4.5%
Other values (42) 239176
39.0%

Most occurring characters

ValueCountFrequency (%)
613319
15.5%
e 459560
 
11.6%
o 247528
 
6.2%
r 244585
 
6.2%
g 239230
 
6.0%
d 225420
 
5.7%
h 215130
 
5.4%
a 205650
 
5.2%
l 180610
 
4.6%
t 150965
 
3.8%
Other values (37) 1181304
29.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2803273
70.7%
Space Separator 613319
 
15.5%
Uppercase Letter 402775
 
10.2%
Decimal Number 69931
 
1.8%
Close Punctuation 29462
 
0.7%
Open Punctuation 29462
 
0.7%
Dash Punctuation 9721
 
0.2%
Other Punctuation 5358
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 459560
16.4%
o 247528
8.8%
r 244585
8.7%
g 239230
8.5%
d 225420
8.0%
h 215130
7.7%
a 205650
7.3%
l 180610
 
6.4%
t 150965
 
5.4%
c 133668
 
4.8%
Other values (9) 500927
17.9%
Uppercase Letter
ValueCountFrequency (%)
B 87794
21.8%
S 62560
15.5%
A 62533
15.5%
M 49373
12.3%
H 48406
12.0%
C 47422
11.8%
E 14345
 
3.6%
D 12754
 
3.2%
W 6541
 
1.6%
L 4405
 
1.1%
Other values (3) 6642
 
1.6%
Decimal Number
ValueCountFrequency (%)
1 26053
37.3%
7 8007
 
11.4%
8 8007
 
11.4%
0 7557
 
10.8%
9 6230
 
8.9%
2 3925
 
5.6%
5 3277
 
4.7%
6 3277
 
4.7%
3 1799
 
2.6%
4 1799
 
2.6%
Space Separator
ValueCountFrequency (%)
613319
100.0%
Close Punctuation
ValueCountFrequency (%)
) 29462
100.0%
Open Punctuation
ValueCountFrequency (%)
( 29462
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9721
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 5358
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3206048
80.9%
Common 757253
 
19.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 459560
14.3%
o 247528
 
7.7%
r 244585
 
7.6%
g 239230
 
7.5%
d 225420
 
7.0%
h 215130
 
6.7%
a 205650
 
6.4%
l 180610
 
5.6%
t 150965
 
4.7%
c 133668
 
4.2%
Other values (22) 903702
28.2%
Common
ValueCountFrequency (%)
613319
81.0%
) 29462
 
3.9%
( 29462
 
3.9%
1 26053
 
3.4%
- 9721
 
1.3%
7 8007
 
1.1%
8 8007
 
1.1%
0 7557
 
1.0%
9 6230
 
0.8%
/ 5358
 
0.7%
Other values (5) 14077
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3963301
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
613319
15.5%
e 459560
 
11.6%
o 247528
 
6.2%
r 244585
 
6.2%
g 239230
 
6.0%
d 225420
 
5.7%
h 215130
 
5.4%
a 205650
 
5.2%
l 180610
 
4.6%
t 150965
 
3.8%
Other values (37) 1181304
29.8%

wage_per_hour
Real number (ℝ)

ZEROS 

Distinct: 1240
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.427186
Minimum: 0
Maximum: 9999
Zeros: 188218
Zeros (%): 94.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 495
Maximum: 9999
Range: 9999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 274.89711
Coefficient of variation (CV): 4.959608
Kurtosis: 155.21813
Mean: 55.427186
Median Absolute Deviation (MAD): 0
Skewness: 8.9350739
Sum: 11058943
Variance: 75568.424
Monotonicity: Not monotonic
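
With 94.3% zeros, the overall mean says little about the rows that actually report a wage. A short sketch, using only the counts printed in this section, that separates the two; the variable names are illustrative, and the unit of wage_per_hour is not stated in the report:

```python
# Values copied from the wage_per_hour summary.
n_rows = 199522
zeros = 188218
total = 11058943  # Sum

nonzero_rows = n_rows - zeros
mean_all = total / n_rows            # matches the reported Mean
mean_nonzero = total / nonzero_rows  # mean over rows with a non-zero value

print(f"non-zero rows: {nonzero_rows}")
print(f"mean over all rows: {mean_all:.6f}")
print(f"mean over non-zero rows: {mean_nonzero:.2f}")
```

The non-zero mean is far higher than the overall mean, which is why the quartiles are all 0 while the 95-th percentile is 495.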
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     188218   94.3%
500                   734      0.4%
600                   546      0.3%
700                   534      0.3%
800                   507      0.3%
1000                  386      0.2%
425                   376      0.2%
900                   336      0.2%
550                   280      0.1%
1200                  256      0.1%
Other values (1230)   7349     3.7%

Value   Count    Frequency (%)
0       188218   94.3%
20      1        < 0.1%
70      1        < 0.1%
75      2        < 0.1%
100     11       < 0.1%
110     1        < 0.1%
125     1        < 0.1%
135     1        < 0.1%
143     1        < 0.1%
150     6        < 0.1%

Value   Count   Frequency (%)
9999    1       < 0.1%
9916    1       < 0.1%
9800    2       < 0.1%
9400    2       < 0.1%
9000    1       < 0.1%
8800    1       < 0.1%
8600    1       < 0.1%
8500    1       < 0.1%
8300    1       < 0.1%
8000    4       < 0.1%

edu_inst
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB
Not in universe: 186942
High school: 6892
College or university: 5688

Length

Max length: 22
Median length: 16
Mean length: 16.032879
Min length: 12

Characters and Unicode

Total characters: 3198912
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe
2nd row High school
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value                   Count    Frequency (%)
Not in universe         186942   93.7%
High school             6892     3.5%
College or university   5688     2.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
not 186942
31.6%
in 186942
31.6%
universe 186942
31.6%
high 6892
 
1.2%
school 6892
 
1.2%
college 5688
 
1.0%
or 5688
 
1.0%
university 5688
 
1.0%

Most occurring characters

ValueCountFrequency (%)
591674
18.5%
i 392152
12.3%
e 390948
12.2%
n 379572
11.9%
o 212102
 
6.6%
s 199522
 
6.2%
r 198318
 
6.2%
v 192630
 
6.0%
u 192630
 
6.0%
t 192630
 
6.0%
Other values (8) 256734
8.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2407716
75.3%
Space Separator 591674
 
18.5%
Uppercase Letter 199522
 
6.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 392152
16.3%
e 390948
16.2%
n 379572
15.8%
o 212102
8.8%
s 199522
8.3%
r 198318
8.2%
v 192630
8.0%
u 192630
8.0%
t 192630
8.0%
l 18268
 
0.8%
Other values (4) 38944
 
1.6%
Uppercase Letter
ValueCountFrequency (%)
N 186942
93.7%
H 6892
 
3.5%
C 5688
 
2.9%
Space Separator
ValueCountFrequency (%)
591674
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2607238
81.5%
Common 591674
 
18.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 392152
15.0%
e 390948
15.0%
n 379572
14.6%
o 212102
8.1%
s 199522
7.7%
r 198318
7.6%
v 192630
7.4%
u 192630
7.4%
t 192630
7.4%
N 186942
7.2%
Other values (7) 69792
 
2.7%
Common
ValueCountFrequency (%)
591674
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3198912
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
591674
18.5%
i 392152
12.3%
e 390948
12.2%
n 379572
11.9%
o 212102
 
6.6%
s 199522
 
6.2%
r 198318
 
6.2%
v 192630
 
6.0%
u 192630
 
6.0%
t 192630
 
6.0%
Other values (8) 256734
8.0%

marital
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB
Never married: 86485
Married-civilian spouse present: 84222
Divorced: 12710
Widowed: 10462
Separated: 3460
Other values (2): 2183

Length

Max length: 32
Median length: 27
Mean length: 20.999845
Min length: 8

Characters and Unicode

Total characters: 4189931
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Divorced
2nd row Never married
3rd row Never married
4th row Never married
5th row Married-civilian spouse present

Common Values

Value                             Count   Frequency (%)
Never married                     86485   43.3%
Married-civilian spouse present   84222   42.2%
Divorced                          12710   6.4%
Widowed                           10462   5.2%
Separated                         3460    1.7%
Married-spouse absent             1518    0.8%
Married-A F spouse present        665     0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
never 86485
18.9%
married 86485
18.9%
spouse 84887
18.5%
present 84887
18.5%
married-civilian 84222
18.4%
divorced 12710
 
2.8%
widowed 10462
 
2.3%
separated 3460
 
0.8%
married-spouse 1518
 
0.3%
absent 1518
 
0.3%
Other values (2) 1330
 
0.3%

Most occurring characters

ValueCountFrequency (%)
e 633649
15.1%
r 533322
12.7%
457964
10.9%
i 448728
10.7%
a 265550
 
6.3%
s 259215
 
6.2%
d 209984
 
5.0%
v 183417
 
4.4%
p 174752
 
4.2%
n 170627
 
4.1%
Other values (16) 852723
20.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3444710
82.2%
Space Separator 457964
 
10.9%
Uppercase Letter 200852
 
4.8%
Dash Punctuation 86405
 
2.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 633649
18.4%
r 533322
15.5%
i 448728
13.0%
a 265550
7.7%
s 259215
7.5%
d 209984
 
6.1%
v 183417
 
5.3%
p 174752
 
5.1%
n 170627
 
5.0%
o 109577
 
3.2%
Other values (7) 455889
13.2%
Uppercase Letter
ValueCountFrequency (%)
N 86485
43.1%
M 86405
43.0%
D 12710
 
6.3%
W 10462
 
5.2%
S 3460
 
1.7%
A 665
 
0.3%
F 665
 
0.3%
Space Separator
ValueCountFrequency (%)
457964
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 86405
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3645562
87.0%
Common 544369
 
13.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 633649
17.4%
r 533322
14.6%
i 448728
12.3%
a 265550
7.3%
s 259215
 
7.1%
d 209984
 
5.8%
v 183417
 
5.0%
p 174752
 
4.8%
n 170627
 
4.7%
o 109577
 
3.0%
Other values (14) 656741
18.0%
Common
ValueCountFrequency (%)
457964
84.1%
- 86405
 
15.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4189931
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 633649
15.1%
r 533322
12.7%
457964
10.9%
i 448728
10.7%
a 265550
 
6.3%
s 259215
 
6.2%
d 209984
 
5.0%
v 183417
 
4.4%
p 174752
 
4.2%
n 170627
 
4.1%
Other values (16) 852723
20.4%

mace
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB
White: 167364
Black: 20415
Asian or Pacific Islander: 5835
Other: 3657
Amer Indian Aleut or Eskimo: 2251

Length

Max length: 28
Median length: 6
Mean length: 6.8331011
Min length: 6

Characters and Unicode

Total characters: 1363354
Distinct characters: 24
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row White
2nd row Asian or Pacific Islander
3rd row White
4th row White
5th row Amer Indian Aleut or Eskimo

Common Values

Value                         Count    Frequency (%)
White                         167364   83.9%
Black                         20415    10.2%
Asian or Pacific Islander     5835     2.9%
Other                         3657     1.8%
Amer Indian Aleut or Eskimo   2251     1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
white 167364
74.0%
black 20415
 
9.0%
or 8086
 
3.6%
asian 5835
 
2.6%
pacific 5835
 
2.6%
islander 5835
 
2.6%
other 3657
 
1.6%
amer 2251
 
1.0%
indian 2251
 
1.0%
aleut 2251
 
1.0%

Most occurring characters

ValueCountFrequency (%)
226031
16.6%
i 189371
13.9%
e 181358
13.3%
t 173272
12.7%
h 171021
12.5%
W 167364
12.3%
a 40171
 
2.9%
c 32085
 
2.4%
l 28501
 
2.1%
k 22666
 
1.7%
Other values (14) 131514
9.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 919378
67.4%
Space Separator 226031
 
16.6%
Uppercase Letter 217945
 
16.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 189371
20.6%
e 181358
19.7%
t 173272
18.8%
h 171021
18.6%
a 40171
 
4.4%
c 32085
 
3.5%
l 28501
 
3.1%
k 22666
 
2.5%
r 19829
 
2.2%
n 16172
 
1.8%
Other values (6) 44932
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
W 167364
76.8%
B 20415
 
9.4%
A 10337
 
4.7%
I 8086
 
3.7%
P 5835
 
2.7%
O 3657
 
1.7%
E 2251
 
1.0%
Space Separator
ValueCountFrequency (%)
226031
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1137323
83.4%
Common 226031
 
16.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 189371
16.7%
e 181358
15.9%
t 173272
15.2%
h 171021
15.0%
W 167364
14.7%
a 40171
 
3.5%
c 32085
 
2.8%
l 28501
 
2.5%
k 22666
 
2.0%
B 20415
 
1.8%
Other values (13) 111099
9.8%
Common
ValueCountFrequency (%)
226031
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1363354
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
226031
16.6%
i 189371
13.9%
e 181358
13.3%
t 173272
12.7%
h 171021
12.5%
W 167364
12.3%
a 40171
 
2.9%
c 32085
 
2.4%
l 28501
 
2.1%
k 22666
 
1.7%
Other values (14) 131514
9.6%

hispanic
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB
All other: 171906
Mexican-American: 8079
Mexican (Mexicano): 7234
Central or South American: 3895
Puerto Rican: 3313
Other values (5): 5095

Length

Max length: 26
Median length: 10
Mean length: 10.968515
Min length: 3

Characters and Unicode

Total characters: 2188460
Distinct characters: 31
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row All other
2nd row All other
3rd row All other
4th row All other
5th row All other

Common Values

Value                       Count    Frequency (%)
All other                   171906   86.2%
Mexican-American            8079     4.0%
Mexican (Mexicano)          7234     3.6%
Central or South American   3895     2.0%
Puerto Rican                3313     1.7%
Other Spanish               2485     1.2%
Cuban                       1126     0.6%
NA                          874      0.4%
Do not know                 306      0.2%
Chicano                     304      0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
other 174391
44.0%
all 171906
43.3%
mexican-american 8079
 
2.0%
mexican 7234
 
1.8%
mexicano 7234
 
1.8%
central 3895
 
1.0%
or 3895
 
1.0%
south 3895
 
1.0%
american 3895
 
1.0%
rican 3313
 
0.8%
Other values (8) 9020
 
2.3%

Most occurring characters

ValueCountFrequency (%)
396757
18.1%
l 347707
15.9%
e 216120
9.9%
r 197468
9.0%
o 191465
8.7%
t 185800
8.5%
A 184754
8.4%
h 181075
8.3%
n 46256
 
2.1%
a 45644
 
2.1%
Other values (21) 195414
8.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1539859
70.4%
Space Separator 396757
 
18.1%
Uppercase Letter 229297
 
10.5%
Dash Punctuation 8079
 
0.4%
Open Punctuation 7234
 
0.3%
Close Punctuation 7234
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 347707
22.6%
e 216120
14.0%
r 197468
12.8%
o 191465
12.4%
t 185800
12.1%
h 181075
11.8%
n 46256
 
3.0%
a 45644
 
3.0%
i 40623
 
2.6%
c 38138
 
2.5%
Other values (8) 49563
 
3.2%
Uppercase Letter
ValueCountFrequency (%)
A 184754
80.6%
M 22547
 
9.8%
S 6380
 
2.8%
C 5325
 
2.3%
P 3313
 
1.4%
R 3313
 
1.4%
O 2485
 
1.1%
N 874
 
0.4%
D 306
 
0.1%
Space Separator
ValueCountFrequency (%)
396757
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8079
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7234
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7234
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1769156
80.8%
Common 419304
 
19.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 347707
19.7%
e 216120
12.2%
r 197468
11.2%
o 191465
10.8%
t 185800
10.5%
A 184754
10.4%
h 181075
10.2%
n 46256
 
2.6%
a 45644
 
2.6%
i 40623
 
2.3%
Other values (17) 132244
 
7.5%
Common
ValueCountFrequency (%)
396757
94.6%
- 8079
 
1.9%
( 7234
 
1.7%
) 7234
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2188460
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
396757
18.1%
l 347707
15.9%
e 216120
9.9%
r 197468
9.0%
o 191465
8.7%
t 185800
8.5%
A 184754
8.4%
h 181075
8.3%
n 46256
 
2.1%
a 45644
 
2.1%
Other values (21) 195414
8.9%

sex
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB
Female: 103983
Male: 95539

Length

Max length: 7
Median length: 7
Mean length: 6.0423211
Min length: 5
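
These lengths are one character longer than the visible labels ("Female" has 6 characters, "Male" has 4), and the Space Separator count for this variable equals the number of rows. One plausible reading, stated here as an assumption rather than something the report confirms, is that every stored value carries one surrounding whitespace character; a leading space is assumed in the sketch below, which checks that reading against the reported totals and shows the cleanup:

```python
# Counts from the "Common Values" table for sex. The leading space is an
# assumption inferred from the reported lengths, not stated by the report.
counts = {" Female": 103983, " Male": 95539}

total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / sum(counts.values())
print(total_chars, round(mean_length, 7))

# Stripping whitespace yields the labels as they appear in the report.
cleaned = {value.strip(): n for value, n in counts.items()}
print(cleaned)
```

Under that assumption the total matches the reported character count for this variable (1205576), so trimming whitespace before grouping or encoding the categorical columns is likely worthwhile.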

Characters and Unicode

Total characters: 1205576
Distinct characters: 7
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Male
2nd row Female
3rd row Female
4th row Female
5th row Female

Common Values

Value    Count    Frequency (%)
Female   103983   52.1%
Male     95539    47.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
female 103983
52.1%
male 95539
47.9%

Most occurring characters

ValueCountFrequency (%)
e 303505
25.2%
199522
16.5%
a 199522
16.5%
l 199522
16.5%
F 103983
 
8.6%
m 103983
 
8.6%
M 95539
 
7.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 806532
66.9%
Space Separator 199522
 
16.5%
Uppercase Letter 199522
 
16.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 303505
37.6%
a 199522
24.7%
l 199522
24.7%
m 103983
 
12.9%
Uppercase Letter
ValueCountFrequency (%)
F 103983
52.1%
M 95539
47.9%
Space Separator
ValueCountFrequency (%)
199522
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1006054
83.5%
Common 199522
 
16.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 303505
30.2%
a 199522
19.8%
l 199522
19.8%
F 103983
 
10.3%
m 103983
 
10.3%
M 95539
 
9.5%
Common
ValueCountFrequency (%)
199522
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1205576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 303505
25.2%
199522
16.5%
a 199522
16.5%
l 199522
16.5%
F 103983
 
8.6%
m 103983
 
8.6%
M 95539
 
7.9%

labor_union
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB
Not in universe: 180458
No: 16034
Yes: 3030

Length

Max length: 16
Median length: 16
Mean length: 14.773058
Min length: 3

Characters and Unicode

Total characters: 2947550
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row No

Common Values

Value             Count    Frequency (%)
Not in universe   180458   90.4%
No                16034    8.0%
Yes               3030     1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
not 180458
32.2%
in 180458
32.2%
universe 180458
32.2%
no 16034
 
2.9%
yes 3030
 
0.5%

Most occurring characters

ValueCountFrequency (%)
560438
19.0%
e 363946
12.3%
i 360916
12.2%
n 360916
12.2%
N 196492
 
6.7%
o 196492
 
6.7%
s 183488
 
6.2%
t 180458
 
6.1%
u 180458
 
6.1%
v 180458
 
6.1%
Other values (2) 183488
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2187590
74.2%
Space Separator 560438
 
19.0%
Uppercase Letter 199522
 
6.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 363946
16.6%
i 360916
16.5%
n 360916
16.5%
o 196492
9.0%
s 183488
8.4%
t 180458
8.2%
u 180458
8.2%
v 180458
8.2%
r 180458
8.2%
Uppercase Letter
ValueCountFrequency (%)
N 196492
98.5%
Y 3030
 
1.5%
Space Separator
ValueCountFrequency (%)
560438
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2387112
81.0%
Common 560438
 
19.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 363946
15.2%
i 360916
15.1%
n 360916
15.1%
N 196492
8.2%
o 196492
8.2%
s 183488
7.7%
t 180458
7.6%
u 180458
7.6%
v 180458
7.6%
r 180458
7.6%
Common
ValueCountFrequency (%)
560438
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2947550
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
560438
19.0%
e 363946
12.3%
i 360916
12.2%
n 360916
12.2%
N 196492
 
6.7%
o 196492
 
6.7%
s 183488
 
6.2%
t 180458
 
6.1%
u 180458
 
6.1%
v 180458
 
6.1%
Other values (2) 183488
 
6.2%

reason_unemployment
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB
Not in universe: 193452
Other job loser: 2038
Re-entrant: 2019
Job loser - on layoff: 976
Job leaver: 598

Length

Max length: 22
Median length: 16
Mean length: 15.954967
Min length: 11

Characters and Unicode

Total characters: 3183367
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value                   Count    Frequency (%)
Not in universe         193452   97.0%
Other job loser         2038     1.0%
Re-entrant              2019     1.0%
Job loser - on layoff   976      0.5%
Job leaver              598      0.3%
New entrant             439      0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
not 193452
32.5%
in 193452
32.5%
universe 193452
32.5%
job 3612
 
0.6%
loser 3014
 
0.5%
other 2038
 
0.3%
re-entrant 2019
 
0.3%
976
 
0.2%
on 976
 
0.2%
layoff 976
 
0.2%
Other values (3) 1476
 
0.2%

Most occurring characters

ValueCountFrequency (%)
595443
18.7%
e 398068
12.5%
n 392796
12.3%
i 386904
12.2%
o 202030
 
6.3%
r 201560
 
6.3%
t 200406
 
6.3%
s 196466
 
6.2%
v 194050
 
6.1%
N 193891
 
6.1%
Other values (13) 221753
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2385407
74.9%
Space Separator 595443
 
18.7%
Uppercase Letter 199522
 
6.3%
Dash Punctuation 2995
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 398068
16.7%
n 392796
16.5%
i 386904
16.2%
o 202030
8.5%
r 201560
8.4%
t 200406
8.4%
s 196466
8.2%
v 194050
8.1%
u 193452
8.1%
l 4588
 
0.2%
Other values (7) 15087
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
N 193891
97.2%
O 2038
 
1.0%
R 2019
 
1.0%
J 1574
 
0.8%
Space Separator
ValueCountFrequency (%)
595443
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2995
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2584929
81.2%
Common 598438
 
18.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 398068
15.4%
n 392796
15.2%
i 386904
15.0%
o 202030
7.8%
r 201560
7.8%
t 200406
7.8%
s 196466
7.6%
v 194050
7.5%
N 193891
7.5%
u 193452
7.5%
Other values (11) 25306
 
1.0%
Common
ValueCountFrequency (%)
595443
99.5%
- 2995
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3183367
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
595443
18.7%
e 398068
12.5%
n 392796
12.3%
i 386904
12.2%
o 202030
 
6.3%
r 201560
 
6.3%
t 200406
 
6.3%
s 196466
 
6.2%
v 194050
 
6.1%
N 193891
 
6.1%
Other values (13) 221753
 
7.0%

employment_type
Categorical

Distinct      8
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   1.5 MiB
Children or Armed Forces
123769 
Full-time schedules
40736 
Not in labor force
26807 
PT for non-econ reasons usually FT
 
3322
Unemployed full-time
 
2311
Other values (3)
 
2577

Length

Max length     35
Median length  25
Mean length    23.33266
Min length     19

Characters and Unicode

Total characters     4655379
Distinct characters  27
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row Children or Armed Forces
2nd row Not in labor force
3rd row Children or Armed Forces
4th row Children or Armed Forces
5th row Full-time schedules

Common Values

Value                               Count   Frequency (%)
Children or Armed Forces            123769  62.0%
Full-time schedules                 40736   20.4%
Not in labor force                  26807   13.4%
PT for non-econ reasons usually FT  3322    1.7%
Unemployed full-time                2311    1.2%
PT for econ reasons usually PT      1209    0.6%
Unemployed part- time               843     0.4%
PT for econ reasons usually FT      525     0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
children 123769
17.2%
or 123769
17.2%
armed 123769
17.2%
forces 123769
17.2%
full-time 43047
 
6.0%
schedules 40736
 
5.6%
not 26807
 
3.7%
in 26807
 
3.7%
labor 26807
 
3.7%
force 26807
 
3.7%
Other values (10) 35176
 
4.9%

Most occurring characters

ValueCountFrequency (%)
721263
15.5%
r 559645
12.0%
e 539896
11.6%
o 349603
 
7.5%
d 291428
 
6.3%
l 290672
 
6.2%
s 220409
 
4.7%
c 196368
 
4.2%
i 194466
 
4.2%
m 170813
 
3.7%
Other values (17) 1120816
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3424676
73.6%
Space Separator 721263
 
15.5%
Uppercase Letter 462228
 
9.9%
Dash Punctuation 47212
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 559645
16.3%
e 539896
15.8%
o 349603
10.2%
d 291428
8.5%
l 290672
8.5%
s 220409
 
6.4%
c 196368
 
5.7%
i 194466
 
5.7%
m 170813
 
5.0%
n 170486
 
5.0%
Other values (8) 440890
12.9%
Uppercase Letter
ValueCountFrequency (%)
F 168352
36.4%
A 123769
26.8%
C 123769
26.8%
N 26807
 
5.8%
T 10112
 
2.2%
P 6265
 
1.4%
U 3154
 
0.7%
Space Separator
ValueCountFrequency (%)
721263
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 47212
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3886904
83.5%
Common 768475
 
16.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 559645
14.4%
e 539896
13.9%
o 349603
 
9.0%
d 291428
 
7.5%
l 290672
 
7.5%
s 220409
 
5.7%
c 196368
 
5.1%
i 194466
 
5.0%
m 170813
 
4.4%
n 170486
 
4.4%
Other values (15) 903118
23.2%
Common
ValueCountFrequency (%)
721263
93.9%
- 47212
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4655379
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
721263
15.5%
r 559645
12.0%
e 539896
11.6%
o 349603
 
7.5%
d 291428
 
6.3%
l 290672
 
6.2%
s 220409
 
4.7%
c 196368
 
4.2%
i 194466
 
4.2%
m 170813
 
3.7%
Other values (17) 1120816
24.1%

gains
Real number (ℝ)

ZEROS 

Distinct      132
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          434.72117
Minimum       0
Maximum       99999
Zeros         192143
Zeros (%)     96.3%
Negative      0
Negative (%)  0.0%
Memory size   1.5 MiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           0
Maximum                    99999
Range                      99999
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               4697.543
Coefficient of variation (CV)    10.805876
Kurtosis                         393.06085
Mean                             434.72117
Median Absolute Deviation (MAD)  0
Skewness                         18.990775
Sum                              86736437
Variance                         22066910
Monotonicity                     Not monotonic
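
Several of the descriptive statistics above are simple functions of the others. The sketch below uses only figures copied from this report to recompute the coefficient of variation (standard deviation over mean), the variance (standard deviation squared), and the share of zeros. The variable names are illustrative.

```python
# Figures copied from the gains section; n_rows is the number of
# observations given in the report's Overview.
n_rows = 199522
mean = 434.72117
std = 4697.543
zeros = 192143

cv = std / mean              # coefficient of variation
variance = std ** 2          # variance is the squared standard deviation
zero_share = zeros / n_rows  # fraction of rows equal to zero

print(f"CV        = {cv:.4f}")
print(f"Variance  = {variance:.0f}")
print(f"Zeros (%) = {100 * zero_share:.1f}")
```

Up to rounding of the inputs, these agree with the reported 10.805876, 22066910 and 96.3%.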
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 192143
96.3%
15024 788
 
0.4%
7688 609
 
0.3%
7298 582
 
0.3%
99999 390
 
0.2%
3103 237
 
0.1%
5178 207
 
0.1%
5013 158
 
0.1%
4386 151
 
0.1%
3325 121
 
0.1%
Other values (122) 4136
 
2.1%
ValueCountFrequency (%)
0 192143
96.3%
114 11
 
< 0.1%
401 33
 
< 0.1%
594 88
 
< 0.1%
914 17
 
< 0.1%
991 59
 
< 0.1%
1055 69
 
< 0.1%
1086 81
 
< 0.1%
1090 2
 
< 0.1%
1111 4
 
< 0.1%
ValueCountFrequency (%)
99999 390
0.2%
41310 2
 
< 0.1%
34095 11
 
< 0.1%
27828 94
 
< 0.1%
25236 23
 
< 0.1%
25124 18
 
< 0.1%
22040 2
 
< 0.1%
20051 91
 
< 0.1%
18481 14
 
< 0.1%
15831 16
 
< 0.1%

losses
Real number (ℝ)

ZEROS 

Distinct      113
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          37.313975
Minimum       0
Maximum       4608
Zeros         195616
Zeros (%)     98.0%
Negative      0
Negative (%)  0.0%
Memory size   1.5 MiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           0
Maximum                    4608
Range                      4608
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               271.8971
Coefficient of variation (CV)    7.2867362
Kurtosis                         61.6326
Mean                             37.313975
Median Absolute Deviation (MAD)  0
Skewness                         7.6325446
Sum                              7444959
Variance                         73928.031
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 195616
98.0%
1902 407
 
0.2%
1977 381
 
0.2%
1887 364
 
0.2%
1602 193
 
0.1%
2415 122
 
0.1%
1485 95
 
< 0.1%
1848 88
 
< 0.1%
1876 87
 
< 0.1%
1672 85
 
< 0.1%
Other values (103) 2084
 
1.0%
ValueCountFrequency (%)
0 195616
98.0%
155 1
 
< 0.1%
213 10
 
< 0.1%
323 10
 
< 0.1%
419 29
 
< 0.1%
625 25
 
< 0.1%
653 7
 
< 0.1%
772 5
 
< 0.1%
810 5
 
< 0.1%
880 9
 
< 0.1%
ValueCountFrequency (%)
4608 4
 
< 0.1%
4356 30
< 0.1%
3900 2
 
< 0.1%
3770 5
 
< 0.1%
3683 4
 
< 0.1%
3500 10
 
< 0.1%
3175 8
 
< 0.1%
3004 11
 
< 0.1%
2824 27
< 0.1%
2788 7
 
< 0.1%

divdends
Real number (ℝ)

SKEWED  ZEROS 

Distinct      1478
Distinct (%)  0.7%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          197.53052
Minimum       0
Maximum       99999
Zeros         178381
Zeros (%)     89.4%
Negative      0
Negative (%)  0.0%
Memory size   1.5 MiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           400
Maximum                    99999
Range                      99999
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               1984.1686
Coefficient of variation (CV)    10.044871
Kurtosis                         1090.5583
Mean                             197.53052
Median Absolute Deviation (MAD)  0
Skewness                         27.786433
Sum                              39411685
Variance                         3936925
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 178381
89.4%
100 1148
 
0.6%
500 1030
 
0.5%
1000 894
 
0.4%
200 866
 
0.4%
50 832
 
0.4%
2000 574
 
0.3%
250 555
 
0.3%
150 549
 
0.3%
300 523
 
0.3%
Other values (1468) 14170
 
7.1%
ValueCountFrequency (%)
0 178381
89.4%
1 472
 
0.2%
2 193
 
0.1%
3 129
 
0.1%
4 75
 
< 0.1%
5 179
 
0.1%
6 100
 
0.1%
7 93
 
< 0.1%
8 94
 
< 0.1%
9 56
 
< 0.1%
ValueCountFrequency (%)
99999 25
< 0.1%
95095 1
 
< 0.1%
75000 5
 
< 0.1%
70000 3
 
< 0.1%
66621 2
 
< 0.1%
60000 7
 
< 0.1%
57678 1
 
< 0.1%
55000 1
 
< 0.1%
54600 2
 
< 0.1%
54500 2
 
< 0.1%

liability
Categorical

Distinct      6
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   1.5 MiB
Nonfiler
75093 
Joint both under 65
67383 
Single
37421 
Joint both 65+
8332 
Head of household
 
7426

Length

Max length     29
Median length  20
Mean length    13.312993
Min length     7

Characters and Unicode

Total characters     2656235
Distinct characters  24
Distinct categories  6
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row Head of household
2nd row Nonfiler
3rd row Nonfiler
4th row Nonfiler
5th row Joint both under 65

Common Values

Value                         Count  Frequency (%)
Nonfiler                      75093  37.6%
Joint both under 65           67383  33.8%
Single                        37421  18.8%
Joint both 65+                8332   4.2%
Head of household             7426   3.7%
Joint one under 65 & one 65+  3867   1.9%
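
The length statistics for this variable only add up if every category string carries one leading space: `Single` has 6 characters, but the reported minimum length is 7. The sketch below checks that reading against the reported total of 2656235 characters. The leading space is an inference from the numbers in this report, not something the report states, and the helper `total_chars` is a hypothetical name.

```python
# Counts copied from the liability Common Values table.
liability_counts = {
    "Nonfiler": 75093,
    "Joint both under 65": 67383,
    "Single": 37421,
    "Joint both 65+": 8332,
    "Head of household": 7426,
    "Joint one under 65 & one 65+": 3867,
}
REPORTED_TOTAL_CHARS = 2656235  # "Total characters" in this section

def total_chars(counts, prefix=""):
    """Total characters over all rows if each value is stored as prefix + value."""
    return sum((len(prefix) + len(value)) * n for value, n in counts.items())

without_space = total_chars(liability_counts)
with_space = total_chars(liability_counts, prefix=" ")

print("matches without leading space:", without_space == REPORTED_TOTAL_CHARS)
print("matches with leading space:   ", with_space == REPORTED_TOTAL_CHARS)
```

If the inference holds, stripping whitespace (for example with `str.strip()`) before grouping or joining on these categories avoids mismatches such as `"Single"` versus `" Single"`.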

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
65 83449
18.3%
joint 79582
17.4%
both 75715
16.6%
nonfiler 75093
16.5%
under 71250
15.6%
single 37421
8.2%
one 7734
 
1.7%
head 7426
 
1.6%
of 7426
 
1.6%
household 7426
 
1.6%

Most occurring characters

ValueCountFrequency (%)
456389
17.2%
n 271080
10.2%
o 260402
 
9.8%
e 206350
 
7.8%
i 192096
 
7.2%
t 155297
 
5.8%
r 146343
 
5.5%
l 119940
 
4.5%
h 90567
 
3.4%
d 86102
 
3.2%
Other values (14) 671669
25.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1817360
68.4%
Space Separator 456389
 
17.2%
Uppercase Letter 199522
 
7.5%
Decimal Number 166898
 
6.3%
Math Symbol 12199
 
0.5%
Other Punctuation 3867
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 271080
14.9%
o 260402
14.3%
e 206350
11.4%
i 192096
10.6%
t 155297
8.5%
r 146343
8.1%
l 119940
6.6%
h 90567
 
5.0%
d 86102
 
4.7%
f 82519
 
4.5%
Other values (5) 206664
11.4%
Uppercase Letter
ValueCountFrequency (%)
J 79582
39.9%
N 75093
37.6%
S 37421
18.8%
H 7426
 
3.7%
Decimal Number
ValueCountFrequency (%)
6 83449
50.0%
5 83449
50.0%
Space Separator
ValueCountFrequency (%)
456389
100.0%
Math Symbol
ValueCountFrequency (%)
+ 12199
100.0%
Other Punctuation
ValueCountFrequency (%)
& 3867
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2016882
75.9%
Common 639353
 
24.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 271080
13.4%
o 260402
12.9%
e 206350
10.2%
i 192096
9.5%
t 155297
 
7.7%
r 146343
 
7.3%
l 119940
 
5.9%
h 90567
 
4.5%
d 86102
 
4.3%
f 82519
 
4.1%
Other values (9) 406186
20.1%
Common
ValueCountFrequency (%)
456389
71.4%
6 83449
 
13.1%
5 83449
 
13.1%
+ 12199
 
1.9%
& 3867
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2656235
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
456389
17.2%
n 271080
10.2%
o 260402
 
9.8%
e 206350
 
7.8%
i 192096
 
7.2%
t 155297
 
5.8%
r 146343
 
5.5%
l 119940
 
4.5%
h 90567
 
3.4%
d 86102
 
3.2%
Other values (14) 671669
25.3%
Distinct      51
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   1.5 MiB

Length

Max length     21
Median length  16
Mean length    15.456872
Min length     2

Characters and Unicode

Total characters     3083986
Distinct characters  46
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row Arkansas
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe
ValueCountFrequency (%)
not 183749
32.2%
universe 183749
32.2%
in 183749
32.2%
california 1714
 
0.3%
north 1311
 
0.2%
utah 1063
 
0.2%
new 975
 
0.2%
carolina 907
 
0.2%
florida 849
 
0.1%
708
 
0.1%
Other values (46) 11228
 
2.0%

Most occurring characters

ValueCountFrequency (%)
570002
18.5%
i 380322
12.3%
n 377216
12.2%
e 373182
12.1%
o 195444
 
6.3%
r 192089
 
6.2%
s 189329
 
6.1%
t 189229
 
6.1%
N 186387
 
6.0%
u 184977
 
6.0%
Other values (36) 245809
8.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2311596
75.0%
Space Separator 570002
 
18.5%
Uppercase Letter 201680
 
6.5%
Other Punctuation 708
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 380322
16.5%
n 377216
16.3%
e 373182
16.1%
o 195444
8.5%
r 192089
8.3%
s 189329
8.2%
t 189229
8.2%
u 184977
8.0%
v 184122
8.0%
a 19048
 
0.8%
Other values (14) 26638
 
1.2%
Uppercase Letter
ValueCountFrequency (%)
N 186387
92.4%
C 3093
 
1.5%
M 2539
 
1.3%
A 1625
 
0.8%
O 1073
 
0.5%
U 1063
 
0.5%
I 933
 
0.5%
F 849
 
0.4%
D 826
 
0.4%
W 577
 
0.3%
Other values (10) 2715
 
1.3%
Space Separator
ValueCountFrequency (%)
570002
100.0%
Other Punctuation
ValueCountFrequency (%)
? 708
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2513276
81.5%
Common 570710
 
18.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 380322
15.1%
n 377216
15.0%
e 373182
14.8%
o 195444
7.8%
r 192089
7.6%
s 189329
7.5%
t 189229
7.5%
N 186387
7.4%
u 184977
7.4%
v 184122
7.3%
Other values (34) 60979
 
2.4%
Common
ValueCountFrequency (%)
570002
99.9%
? 708
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3083986
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
570002
18.5%
i 380322
12.3%
n 377216
12.2%
e 373182
12.1%
o 195444
 
6.3%
r 192089
 
6.2%
s 189329
 
6.1%
t 189229
 
6.1%
N 186387
 
6.0%
u 184977
 
6.0%
Other values (36) 245809
8.0%
Distinct      8
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   1.5 MiB
Householder
75475 
Child under 18 never married
50426 
Spouse of householder
41709 
Child 18 or older
14430 
Other relative of householder
9702 
Other values (3)
7780 

Length

Max length     37
Median length  30
Mean length    20.287883
Min length     12

Characters and Unicode

Total characters     4047879
Distinct characters  29
Distinct categories  5
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row Householder
2nd row Child 18 or older
3rd row Child under 18 never married
4th row Child under 18 never married
5th row Spouse of householder

Common Values

Value                                 Count  Frequency (%)
Householder                           75475  37.8%
Child under 18 never married          50426  25.3%
Spouse of householder                 41709  20.9%
Child 18 or older                     14430  7.2%
Other relative of householder         9702   4.9%
Nonrelative of householder            7601   3.8%
Group Quarters- Secondary individual  132    0.1%
Child under 18 ever married           47     < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
householder 134487
23.5%
child 64903
11.3%
18 64903
11.3%
of 59012
10.3%
under 50473
 
8.8%
married 50473
 
8.8%
never 50426
 
8.8%
spouse 41709
 
7.3%
older 14430
 
2.5%
or 14430
 
2.5%
Other values (8) 27580
 
4.8%

Most occurring characters

ValueCountFrequency (%)
572826
14.2%
e 571577
14.1%
o 406420
10.0%
r 392772
9.7%
d 315162
7.8%
h 268104
 
6.6%
l 231255
 
5.7%
u 227065
 
5.6%
s 176328
 
4.4%
i 133075
 
3.3%
Other values (19) 753295
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3145329
77.7%
Space Separator 572826
 
14.2%
Uppercase Letter 199786
 
4.9%
Decimal Number 129806
 
3.2%
Dash Punctuation 132
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 571577
18.2%
o 406420
12.9%
r 392772
12.5%
d 315162
10.0%
h 268104
8.5%
l 231255
7.4%
u 227065
 
7.2%
s 176328
 
5.6%
i 133075
 
4.2%
n 108764
 
3.5%
Other values (8) 314807
10.0%
Uppercase Letter
ValueCountFrequency (%)
H 75475
37.8%
C 64903
32.5%
S 41841
20.9%
O 9702
 
4.9%
N 7601
 
3.8%
G 132
 
0.1%
Q 132
 
0.1%
Decimal Number
ValueCountFrequency (%)
8 64903
50.0%
1 64903
50.0%
Space Separator
ValueCountFrequency (%)
572826
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 132
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3345115
82.6%
Common 702764
 
17.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 571577
17.1%
o 406420
12.1%
r 392772
11.7%
d 315162
9.4%
h 268104
8.0%
l 231255
6.9%
u 227065
 
6.8%
s 176328
 
5.3%
i 133075
 
4.0%
n 108764
 
3.3%
Other values (15) 514593
15.4%
Common
ValueCountFrequency (%)
572826
81.5%
8 64903
 
9.2%
1 64903
 
9.2%
- 132
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4047879
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
572826
14.2%
e 571577
14.1%
o 406420
10.0%
r 392772
9.7%
d 315162
7.8%
h 268104
 
6.6%
l 231255
 
5.7%
u 227065
 
5.6%
s 176328
 
4.4%
i 133075
 
3.3%
Other values (19) 753295
18.6%

instance_weight
Real number (ℝ)

Distinct      99800
Distinct (%)  50.0%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          1740.3805
Minimum       37.87
Maximum       18656.3
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   1.5 MiB

Quantile statistics

Minimum                    37.87
5-th percentile            395.341
Q1                         1061.6075
median                     1618.31
Q3                         2188.61
95-th percentile           3585.9095
Maximum                    18656.3
Range                      18618.43
Interquartile range (IQR)  1127.0025

Descriptive statistics

Standard deviation               993.77064
Coefficient of variation (CV)    0.5710077
Kurtosis                         5.4124708
Mean                             1740.3805
Median Absolute Deviation (MAD)  561.465
Skewness                         1.432729
Sum                              3.4724419 × 10^8
Variance                         987580.09
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1191.21 32
 
< 0.1%
753.23 32
 
< 0.1%
1787.34 32
 
< 0.1%
1601.4 32
 
< 0.1%
1317.51 31
 
< 0.1%
707.9 31
 
< 0.1%
1070.15 30
 
< 0.1%
1002.02 28
 
< 0.1%
1839.19 28
 
< 0.1%
1033.83 28
 
< 0.1%
Other values (99790) 199218
99.8%
ValueCountFrequency (%)
37.87 1
 
< 0.1%
39.11 1
 
< 0.1%
40.67 2
 
< 0.1%
42.82 2
 
< 0.1%
43.26 3
< 0.1%
45.74 2
 
< 0.1%
47.83 6
< 0.1%
49.82 2
 
< 0.1%
52.43 1
 
< 0.1%
52.46 4
< 0.1%
ValueCountFrequency (%)
18656.3 1
< 0.1%
16349.2 1
< 0.1%
13911.5 1
< 0.1%
13145.1 1
< 0.1%
13114.2 1
< 0.1%
12960.2 1
< 0.1%
12399.9 1
< 0.1%
12184.5 1
< 0.1%
11958.4 1
< 0.1%
11863 1
< 0.1%

migration_msa
Categorical

IMBALANCE 

Distinct      10
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   1.5 MiB
?
99695 
Nonmover
82538 
MSA to MSA
10601 
NonMSA to nonMSA
 
2811
Not in universe
 
1516
Other values (5)
 
2361

Length

Max length     17
Median length  16
Mean length    5.8412055
Min length     2

Characters and Unicode

Total characters     1165449
Distinct characters  21
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row MSA to MSA
2nd row ?
3rd row Nonmover
4th row Nonmover
5th row ?

Common Values

Value             Count  Frequency (%)
?                 99695  50.0%
Nonmover          82538  41.4%
MSA to MSA        10601  5.3%
NonMSA to nonMSA  2811   1.4%
Not in universe   1516   0.8%
MSA to nonMSA     790    0.4%
NonMSA to MSA     615    0.3%
Abroad to MSA     453    0.2%
Not identifiable  430    0.2%
Abroad to nonMSA  73     < 0.1%
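
The report lists 0 missing values for this variable, yet half of the rows hold the placeholder `?`. The following is a minimal sketch, assuming `?` stands for an unknown value (an interpretation, not something this report states), of how the missing share changes once the placeholder is treated as missing. The counts are copied from the Common Values table above, and `PLACEHOLDERS` is a hypothetical name.

```python
# Counts copied from the migration_msa Common Values table.
migration_msa_counts = {
    "?": 99695,
    "Nonmover": 82538,
    "MSA to MSA": 10601,
    "NonMSA to nonMSA": 2811,
    "Not in universe": 1516,
    "MSA to nonMSA": 790,
    "NonMSA to MSA": 615,
    "Abroad to MSA": 453,
    "Not identifiable": 430,
    "Abroad to nonMSA": 73,
}
PLACEHOLDERS = {"?"}  # assumption: "?" marks an unknown value

total = sum(migration_msa_counts.values())
# strip() guards against stray surrounding whitespace in the raw strings.
missing = sum(n for value, n in migration_msa_counts.items()
              if value.strip() in PLACEHOLDERS)

print(f"rows: {total}")
print(f"treated as missing: {missing} ({100 * missing / total:.1f}%)")
```

The same placeholder appears with the same count (99695) in migration_reg, migration_within and sunbelt, so one shared rule would cover all four columns.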

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
99695
42.7%
nonmover 82538
35.3%
msa 23060
 
9.9%
to 15343
 
6.6%
nonmsa 7100
 
3.0%
not 1946
 
0.8%
in 1516
 
0.6%
universe 1516
 
0.6%
abroad 526
 
0.2%
identifiable 430
 
0.2%

Most occurring characters

ValueCountFrequency (%)
233670
20.0%
o 189991
16.3%
? 99695
8.6%
n 96774
8.3%
N 87910
 
7.5%
e 86430
 
7.4%
r 84580
 
7.3%
v 84054
 
7.2%
m 82538
 
7.1%
A 30686
 
2.6%
Other values (11) 89121
 
7.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 653168
56.0%
Space Separator 233670
 
20.0%
Uppercase Letter 178916
 
15.4%
Other Punctuation 99695
 
8.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 189991
29.1%
n 96774
14.8%
e 86430
13.2%
r 84580
12.9%
v 84054
12.9%
m 82538
12.6%
t 17719
 
2.7%
i 4322
 
0.7%
u 1516
 
0.2%
s 1516
 
0.2%
Other values (5) 3728
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
N 87910
49.1%
A 30686
 
17.2%
S 30160
 
16.9%
M 30160
 
16.9%
Space Separator
ValueCountFrequency (%)
233670
100.0%
Other Punctuation
ValueCountFrequency (%)
? 99695
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 832084
71.4%
Common 333365
28.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 189991
22.8%
n 96774
11.6%
N 87910
10.6%
e 86430
10.4%
r 84580
10.2%
v 84054
10.1%
m 82538
9.9%
A 30686
 
3.7%
S 30160
 
3.6%
M 30160
 
3.6%
Other values (9) 28801
 
3.5%
Common
ValueCountFrequency (%)
233670
70.1%
? 99695
29.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1165449
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
233670
20.0%
o 189991
16.3%
? 99695
8.6%
n 96774
8.3%
N 87910
 
7.5%
e 86430
 
7.4%
r 84580
 
7.3%
v 84054
 
7.2%
m 82538
 
7.1%
A 30686
 
2.6%
Other values (11) 89121
 
7.6%

migration_reg
Categorical

IMBALANCE 

Distinct      9
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   1.5 MiB
?
99695 
Nonmover
82538 
Same county
 
9812
Different county same state
 
2797
Not in universe
 
1516
Other values (4)
 
3164

Length

Max length     31
Median length  30
Mean length    6.1668839
Min length     2

Characters and Unicode

Total characters     1230429
Distinct characters  23
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row Same county
2nd row ?
3rd row Nonmover
4th row Nonmover
5th row ?

Common Values

Value                           Count  Frequency (%)
?                               99695  50.0%
Nonmover                        82538  41.4%
Same county                     9812   4.9%
Different county same state     2797   1.4%
Not in universe                 1516   0.8%
Different region                1178   0.6%
Different state same division   991    0.5%
Abroad                          530    0.3%
Different division same region  465    0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
99695
44.1%
nonmover 82538
36.5%
same 14065
 
6.2%
county 12609
 
5.6%
different 5431
 
2.4%
state 3788
 
1.7%
region 1643
 
0.7%
not 1516
 
0.7%
in 1516
 
0.7%
universe 1516
 
0.7%
Other values (2) 1986
 
0.9%

Most occurring characters

ValueCountFrequency (%)
226303
18.4%
o 182830
14.9%
e 115928
9.4%
n 106709
8.7%
? 99695
8.1%
m 96603
7.9%
r 91658
7.4%
v 85510
 
6.9%
N 84054
 
6.8%
t 27132
 
2.2%
Other values (13) 114007
9.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 804604
65.4%
Space Separator 226303
 
18.4%
Uppercase Letter 99827
 
8.1%
Other Punctuation 99695
 
8.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 182830
22.7%
e 115928
14.4%
n 106709
13.3%
m 96603
12.0%
r 91658
11.4%
v 85510
10.6%
t 27132
 
3.4%
a 18383
 
2.3%
i 14474
 
1.8%
u 14125
 
1.8%
Other values (7) 51252
 
6.4%
Uppercase Letter
ValueCountFrequency (%)
N 84054
84.2%
S 9812
 
9.8%
D 5431
 
5.4%
A 530
 
0.5%
Space Separator
ValueCountFrequency (%)
226303
100.0%
Other Punctuation
ValueCountFrequency (%)
? 99695
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 904431
73.5%
Common 325998
 
26.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 182830
20.2%
e 115928
12.8%
n 106709
11.8%
m 96603
10.7%
r 91658
10.1%
v 85510
9.5%
N 84054
9.3%
t 27132
 
3.0%
a 18383
 
2.0%
i 14474
 
1.6%
Other values (11) 81150
9.0%
Common
ValueCountFrequency (%)
226303
69.4%
? 99695
30.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1230429
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
226303
18.4%
o 182830
14.9%
e 115928
9.4%
n 106709
8.7%
? 99695
8.1%
m 96603
7.9%
r 91658
7.4%
v 85510
 
6.9%
N 84054
 
6.8%
t 27132
 
2.2%
Other values (13) 114007
9.3%

migration_within
Categorical

IMBALANCE 

Distinct      10
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   1.5 MiB
?
99695 
Nonmover
82538 
Same county
 
9812
Different county same state
 
2797
Not in universe
 
1516
Other values (5)
 
3164

Length

Max length     29
Median length  28
Mean length    6.1860597
Min length     2

Characters and Unicode

Total characters     1234255
Distinct characters  26
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row Same county
2nd row ?
3rd row Nonmover
4th row Nonmover
5th row ?

Common Values

Value                         Count  Frequency (%)
?                             99695  50.0%
Nonmover                      82538  41.4%
Same county                   9812   4.9%
Different county same state   2797   1.4%
Not in universe               1516   0.8%
Different state in South      973    0.5%
Different state in West       679    0.3%
Different state in Midwest    551    0.3%
Abroad                        530    0.3%
Different state in Northeast  431    0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
99695
43.6%
nonmover 82538
36.1%
same 12609
 
5.5%
county 12609
 
5.5%
different 5431
 
2.4%
state 5431
 
2.4%
in 4150
 
1.8%
not 1516
 
0.7%
universe 1516
 
0.7%
south 973
 
0.4%
Other values (4) 2191
 
1.0%

Most occurring characters

ValueCountFrequency (%)
228659
18.5%
o 181135
14.7%
e 116133
9.4%
n 106244
8.6%
? 99695
8.1%
m 95147
7.7%
r 90446
 
7.3%
N 84485
 
6.8%
v 84054
 
6.8%
t 33483
 
2.7%
Other values (16) 114774
9.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 803440
65.1%
Space Separator 228659
 
18.5%
Uppercase Letter 102461
 
8.3%
Other Punctuation 99695
 
8.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 181135
22.5%
e 116133
14.5%
n 106244
13.2%
m 95147
11.8%
r 90446
11.3%
v 84054
10.5%
t 33483
 
4.2%
a 19001
 
2.4%
u 15098
 
1.9%
c 12609
 
1.6%
Other values (8) 50090
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
N 84485
82.5%
S 10785
 
10.5%
D 5431
 
5.3%
W 679
 
0.7%
M 551
 
0.5%
A 530
 
0.5%
Space Separator
ValueCountFrequency (%)
228659
100.0%
Other Punctuation
ValueCountFrequency (%)
? 99695
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 905901
73.4%
Common 328354
 
26.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 181135
20.0%
e 116133
12.8%
n 106244
11.7%
m 95147
10.5%
r 90446
10.0%
N 84485
9.3%
v 84054
9.3%
t 33483
 
3.7%
a 19001
 
2.1%
u 15098
 
1.7%
Other values (14) 80675
8.9%
Common
ValueCountFrequency (%)
228659
69.6%
? 99695
30.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1234255
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
228659
18.5%
o 181135
14.7%
e 116133
9.4%
n 106244
8.6%
? 99695
8.1%
m 95147
7.7%
r 90446
 
7.3%
N 84485
 
6.8%
v 84054
 
6.8%
t 33483
 
2.7%
Other values (16) 114774
9.3%

live_one_year
Categorical

Distinct      3
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   1.5 MiB
Not in universe under 1 year old
101211 
Yes
82538 
No
15773 

Length

Max length     33
Median length  33
Mean length    18.6317
Min length     3

Characters and Unicode

Total characters     3717434
Distinct characters  17
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row No
2nd row Not in universe under 1 year old
3rd row Yes
4th row Yes
5th row Not in universe under 1 year old

Common Values

Value                             Count   Frequency (%)
Not in universe under 1 year old  101211  50.7%
Yes                               82538   41.4%
No                                15773   7.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
not 101211
12.5%
in 101211
12.5%
universe 101211
12.5%
under 101211
12.5%
1 101211
12.5%
year 101211
12.5%
old 101211
12.5%
yes 82538
10.2%
no 15773
 
2.0%

Most occurring characters

ValueCountFrequency (%)
806788
21.7%
e 487382
13.1%
n 303633
 
8.2%
r 303633
 
8.2%
o 218195
 
5.9%
i 202422
 
5.4%
u 202422
 
5.4%
d 202422
 
5.4%
s 183749
 
4.9%
N 116984
 
3.1%
Other values (7) 689804
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2609913
70.2%
Space Separator 806788
 
21.7%
Uppercase Letter 199522
 
5.4%
Decimal Number 101211
 
2.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 487382
18.7%
n 303633
11.6%
r 303633
11.6%
o 218195
8.4%
i 202422
7.8%
u 202422
7.8%
d 202422
7.8%
s 183749
 
7.0%
t 101211
 
3.9%
v 101211
 
3.9%
Other values (3) 303633
11.6%
Uppercase Letter
ValueCountFrequency (%)
N 116984
58.6%
Y 82538
41.4%
Space Separator
ValueCountFrequency (%)
806788
100.0%
Decimal Number
ValueCountFrequency (%)
1 101211
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2809435
75.6%
Common 907999
 
24.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 487382
17.3%
n 303633
10.8%
r 303633
10.8%
o 218195
7.8%
i 202422
7.2%
u 202422
7.2%
d 202422
7.2%
s 183749
 
6.5%
N 116984
 
4.2%
t 101211
 
3.6%
Other values (5) 487382
17.3%
Common
ValueCountFrequency (%)
806788
88.9%
1 101211
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3717434
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
806788
21.7%
e 487382
13.1%
n 303633
 
8.2%
r 303633
 
8.2%
o 218195
 
5.9%
i 202422
 
5.4%
u 202422
 
5.4%
d 202422
 
5.4%
s 183749
 
4.9%
N 116984
 
3.1%
Other values (7) 689804
18.6%

sunbelt
Categorical

Distinct      4
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   1.5 MiB
?
99695 
Not in universe
84054 
No
9987 
Yes
 
5786

Length

Max length     16
Median length  4
Mean length    8.0059292
Min length     2

Characters and Unicode

Total characters     1597359
Distinct characters  13
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row Yes
2nd row ?
3rd row Not in universe
4th row Not in universe
5th row ?

Common Values

Value            Count  Frequency (%)
?                99695  50.0%
Not in universe  84054  42.1%
No               9987   5.0%
Yes              5786   2.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
99695
27.1%
not 84054
22.9%
in 84054
22.9%
universe 84054
22.9%
no 9987
 
2.7%
yes 5786
 
1.6%

Most occurring characters

Value  Count  Frequency (%)
(space)  367630  23.0%
e  173894  10.9%
i  168108  10.5%
n  168108  10.5%
?  99695  6.2%
N  94041  5.9%
o  94041  5.9%
s  89840  5.6%
t  84054  5.3%
u  84054  5.3%
Other values (3)  173894  10.9%

Most occurring categories

Value  Count  Frequency (%)
Lowercase Letter  1030207  64.5%
Space Separator  367630  23.0%
Uppercase Letter  99827  6.2%
Other Punctuation  99695  6.2%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
e  173894  16.9%
i  168108  16.3%
n  168108  16.3%
o  94041  9.1%
s  89840  8.7%
t  84054  8.2%
u  84054  8.2%
v  84054  8.2%
r  84054  8.2%

Uppercase Letter
Value  Count  Frequency (%)
N  94041  94.2%
Y  5786  5.8%

Space Separator
Value  Count  Frequency (%)
(space)  367630  100.0%

Other Punctuation
Value  Count  Frequency (%)
?  99695  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  1130034  70.7%
Common  467325  29.3%

Most frequent character per script

Latin
Value  Count  Frequency (%)
e  173894  15.4%
i  168108  14.9%
n  168108  14.9%
N  94041  8.3%
o  94041  8.3%
s  89840  8.0%
t  84054  7.4%
u  84054  7.4%
v  84054  7.4%
r  84054  7.4%

Common
Value  Count  Frequency (%)
(space)  367630  78.7%
?  99695  21.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1597359  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(space)  367630  23.0%
e  173894  10.9%
i  168108  10.5%
n  168108  10.5%
?  99695  6.2%
N  94041  5.9%
o  94041  5.9%
s  89840  5.6%
t  84054  5.3%
u  84054  5.3%
Other values (3)  173894  10.9%

person_worked
Real number (ℝ)

ZEROS 

Distinct 7
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 1.9561903
Minimum 0
Maximum 6
Zeros 95982
Zeros (%) 48.1%
Negative 0
Negative (%) 0.0%
Memory size 1.5 MiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 1
Q3 4
95-th percentile 6
Maximum 6
Range 6
Interquartile range (IQR) 4

Descriptive statistics

Standard deviation 2.3651274
Coefficient of variation (CV) 1.2090477
Kurtosis -1.0822581
Mean 1.9561903
Median Absolute Deviation (MAD) 1
Skewness 0.75155306
Sum 390303
Variance 5.5938275
Monotonicity Not monotonic
Histogram with fixed size bins (bins=7)
Common values
Value  Count  Frequency (%)
0  95982  48.1%
6  36511  18.3%
1  23109  11.6%
4  14379  7.2%
3  13425  6.7%
2  10081  5.1%
5  6035  3.0%

Minimum values
Value  Count  Frequency (%)
0  95982  48.1%
1  23109  11.6%
2  10081  5.1%
3  13425  6.7%
4  14379  7.2%
5  6035  3.0%
6  36511  18.3%

Maximum values
Value  Count  Frequency (%)
6  36511  18.3%
5  6035  3.0%
4  14379  7.2%
3  13425  6.7%
2  10081  5.1%
1  23109  11.6%
0  95982  48.1%
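
The summary statistics reported for person_worked can be recomputed from its value counts alone. The sketch below uses only the standard library and assumes the sample variance (n - 1 in the denominator), a choice made here because it reproduces the reported figure; the variable names are illustrative.

```python
import math

# Value counts for person_worked, copied from the frequency table above.
COUNTS = {0: 95982, 1: 23109, 2: 10081, 3: 13425, 4: 14379, 5: 6035, 6: 36511}

n = sum(COUNTS.values())  # number of observations
total = sum(value * count for value, count in COUNTS.items())
mean = total / n
# Sample variance (n - 1 in the denominator); assumed, see the note above.
variance = sum(count * (value - mean) ** 2 for value, count in COUNTS.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation
zeros_pct = 100 * COUNTS[0] / n

print(f"n={n} sum={total} mean={mean:.7f}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.7f} zeros={zeros_pct:.1f}%")
```

The results agree with the reported Sum, Mean, Variance, Standard deviation, Coefficient of variation and Zeros (%) to the precision shown in the report.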

under18
Categorical

Distinct 5
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.5 MiB

Not in universe  144231
Both parents present  38983
Mother only present  12772
Father only present  1883
Neither parent present  1653

Length

Max length 23
Median length 16
Mean length 17.328706
Min length 16

Characters and Unicode

Total characters 3457458
Distinct characters 19
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Both parents present
4th row Both parents present
5th row Not in universe

Common Values

Value  Count  Frequency (%)
Not in universe  144231  72.3%
Both parents present  38983  19.5%
Mother only present  12772  6.4%
Father only present  1883  0.9%
Neither parent present  1653  0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
Value  Count  Frequency (%)
not  144231  24.1%
in  144231  24.1%
universe  144231  24.1%
present  55291  9.2%
both  38983  6.5%
parents  38983  6.5%
only  14655  2.4%
mother  12772  2.1%
father  1883  0.3%
neither  1653  0.3%
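
The word-frequency table above appears consistent with splitting each category on whitespace, lower-casing the pieces, and weighting them by the category's row count. The sketch below works under that assumption, with the counts copied from the Common Values table; `word_counts` is a hypothetical helper, not a ydata-profiling function.

```python
from collections import Counter

# Category counts for under18, copied from the Common Values table.
UNDER18_COUNTS = {
    "Not in universe": 144231,
    "Both parents present": 38983,
    "Mother only present": 12772,
    "Father only present": 1883,
    "Neither parent present": 1653,
}

def word_counts(value_counts):
    """Split each category on whitespace, lower-case the words, weight by row count."""
    words = Counter()
    for value, n_rows in value_counts.items():
        for word in value.lower().split():
            words[word] += n_rows
    return words

words = word_counts(UNDER18_COUNTS)
total = sum(words.values())
for word, count in words.most_common(10):
    print(f"{word:10s} {count:7d} {100 * count / total:5.1f}%")
```

For example, `present` occurs in four of the five categories, so its count is the sum of those four row counts.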

Most occurring characters

Value  Count  Frequency (%)
(space)  598566  17.3%
e  457641  13.2%
n  399044  11.5%
t  295449  8.5%
i  290115  8.4%
r  256466  7.4%
s  238505  6.9%
o  210641  6.1%
N  145884  4.2%
u  144231  4.2%
Other values (9)  420916  12.2%

Most occurring categories

Value  Count  Frequency (%)
Lowercase Letter  2659370  76.9%
Space Separator  598566  17.3%
Uppercase Letter  199522  5.8%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
e  457641  17.2%
n  399044  15.0%
t  295449  11.1%
i  290115  10.9%
r  256466  9.6%
s  238505  9.0%
o  210641  7.9%
u  144231  5.4%
v  144231  5.4%
p  95927  3.6%
Other values (4)  127120  4.8%

Uppercase Letter
Value  Count  Frequency (%)
N  145884  73.1%
B  38983  19.5%
M  12772  6.4%
F  1883  0.9%

Space Separator
Value  Count  Frequency (%)
(space)  598566  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  2858892  82.7%
Common  598566  17.3%

Most frequent character per script

Latin
Value  Count  Frequency (%)
e  457641  16.0%
n  399044  14.0%
t  295449  10.3%
i  290115  10.1%
r  256466  9.0%
s  238505  8.3%
o  210641  7.4%
N  145884  5.1%
u  144231  5.0%
v  144231  5.0%
Other values (8)  276685  9.7%

Common
Value  Count  Frequency (%)
(space)  598566  100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3457458  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(space)  598566  17.3%
e  457641  13.2%
n  399044  11.5%
t  295449  8.5%
i  290115  8.4%
r  256466  7.4%
s  238505  6.9%
o  210641  6.1%
N  145884  4.2%
u  144231  4.2%
Other values (9)  420916  12.2%

citizen
Categorical

IMBALANCE 

Distinct 5
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.5 MiB

Native- Born in the United States  176991
Foreign born- Not a citizen of U S  13401
Foreign born- U S citizen by naturalization  5855
Native- Born abroad of American Parent(s)  1756
Native- Born in Puerto Rico or U S Outlying  1519

Length

Max length 44
Median length 34
Mean length 34.574323
Min length 34

Characters and Unicode

Total characters 6898338
Distinct characters 33
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Native- Born in the United States
2nd row Foreign born- Not a citizen of U S
3rd row Native- Born in the United States
4th row Native- Born in the United States
5th row Native- Born in the United States

Common Values

Value  Count  Frequency (%)
Native- Born in the United States  176991  88.7%
Foreign born- Not a citizen of U S  13401  6.7%
Foreign born- U S citizen by naturalization  5855  2.9%
Native- Born abroad of American Parent(s)  1756  0.9%
Native- Born in Puerto Rico or U S Outlying  1519  0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
Value  Count  Frequency (%)
born  199522  16.2%
native  180266  14.6%
in  178510  14.5%
the  176991  14.3%
united  176991  14.3%
states  176991  14.3%
s  20775  1.7%
u  20775  1.7%
citizen  19256  1.6%
foreign  19256  1.6%
Other values (12)  65013  5.3%

Most occurring characters

Value  Count  Frequency (%)
(space)  1247747  18.1%
t  937391  13.6%
e  754782  10.9%
n  610276  8.8%
i  610039  8.8%
a  395247  5.7%
o  259504  3.8%
r  232939  3.4%
-  199522  2.9%
S  197766  2.9%
Other values (23)  1453125  21.1%

Most occurring categories

Value  Count  Frequency (%)
Lowercase Letter  4650767  67.4%
Space Separator  1247747  18.1%
Uppercase Letter  796790  11.6%
Dash Punctuation  199522  2.9%
Open Punctuation  1756  < 0.1%
Close Punctuation  1756  < 0.1%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
t  937391  20.2%
e  754782  16.2%
n  610276  13.1%
i  610039  13.1%
a  395247  8.5%
o  259504  5.6%
r  232939  5.0%
v  180266  3.9%
s  178747  3.8%
d  178747  3.8%
Other values (10)  312829  6.7%

Uppercase Letter
Value  Count  Frequency (%)
S  197766  24.8%
U  197766  24.8%
N  193667  24.3%
B  180266  22.6%
F  19256  2.4%
P  3275  0.4%
A  1756  0.2%
R  1519  0.2%
O  1519  0.2%

Space Separator
Value  Count  Frequency (%)
(space)  1247747  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  199522  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  1756  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  1756  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  5447557  79.0%
Common  1450781  21.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
t  937391  17.2%
e  754782  13.9%
n  610276  11.2%
i  610039  11.2%
a  395247  7.3%
o  259504  4.8%
r  232939  4.3%
S  197766  3.6%
U  197766  3.6%
N  193667  3.6%
Other values (19)  1058180  19.4%

Common
Value  Count  Frequency (%)
(space)  1247747  86.0%
-  199522  13.8%
(  1756  0.1%
)  1756  0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6898338  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(space)  1247747  18.1%
t  937391  13.6%
e  754782  10.9%
n  610276  8.8%
i  610039  8.8%
a  395247  5.7%
o  259504  3.8%
r  232939  3.4%
-  199522  2.9%
S  197766  2.9%
Other values (23)  1453125  21.1%

person_income
Categorical

IMBALANCE 

Distinct 3
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.5 MiB

0  180671
2  16153
1  2698

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 199522
Distinct characters 3
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 2

Common Values

Value  Count  Frequency (%)
0  180671  90.6%
2  16153  8.1%
1  2698  1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
Value  Count  Frequency (%)
0  180671  90.6%
2  16153  8.1%
1  2698  1.4%

Most occurring characters

Value  Count  Frequency (%)
0  180671  90.6%
2  16153  8.1%
1  2698  1.4%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  199522  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  180671  90.6%
2  16153  8.1%
1  2698  1.4%

Most occurring scripts

Value  Count  Frequency (%)
Common  199522  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  180671  90.6%
2  16153  8.1%
1  2698  1.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  199522  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  180671  90.6%
2  16153  8.1%
1  2698  1.4%

own_bus
Categorical

IMBALANCE 

Distinct 3
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.5 MiB

Not in universe  197538
No  1593
Yes  391

Length

Max length 16
Median length 16
Mean length 15.872691
Min length 3

Characters and Unicode

Total characters 3166951
Distinct characters 12
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value  Count  Frequency (%)
Not in universe  197538  99.0%
No  1593  0.8%
Yes  391  0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
Value  Count  Frequency (%)
not  197538  33.2%
in  197538  33.2%
universe  197538  33.2%
no  1593  0.3%
yes  391  0.1%

Most occurring characters

Value  Count  Frequency (%)
(space)  594598  18.8%
e  395467  12.5%
i  395076  12.5%
n  395076  12.5%
N  199131  6.3%
o  199131  6.3%
s  197929  6.2%
t  197538  6.2%
u  197538  6.2%
v  197538  6.2%
Other values (2)  197929  6.2%

Most occurring categories

Value  Count  Frequency (%)
Lowercase Letter  2372831  74.9%
Space Separator  594598  18.8%
Uppercase Letter  199522  6.3%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
e  395467  16.7%
i  395076  16.6%
n  395076  16.6%
o  199131  8.4%
s  197929  8.3%
t  197538  8.3%
u  197538  8.3%
v  197538  8.3%
r  197538  8.3%

Uppercase Letter
Value  Count  Frequency (%)
N  199131  99.8%
Y  391  0.2%

Space Separator
Value  Count  Frequency (%)
(space)  594598  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  2572353  81.2%
Common  594598  18.8%

Most frequent character per script

Latin
Value  Count  Frequency (%)
e  395467  15.4%
i  395076  15.4%
n  395076  15.4%
N  199131  7.7%
o  199131  7.7%
s  197929  7.7%
t  197538  7.7%
u  197538  7.7%
v  197538  7.7%
r  197538  7.7%

Common
Value  Count  Frequency (%)
(space)  594598  100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3166951  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(space)  594598  18.8%
e  395467  12.5%
i  395076  12.5%
n  395076  12.5%
N  199131  6.3%
o  199131  6.3%
s  197929  6.2%
t  197538  6.2%
u  197538  6.2%
v  197538  6.2%
Other values (2)  197929  6.2%

week_workd
Real number (ℝ)

ZEROS 

Distinct 53
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 23.175013
Minimum 0
Maximum 52
Zeros 95982
Zeros (%) 48.1%
Negative 0
Negative (%) 0.0%
Memory size 1.5 MiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 8
Q3 52
95-th percentile 52
Maximum 52
Range 52
Interquartile range (IQR) 52

Descriptive statistics

Standard deviation 24.411494
Coefficient of variation (CV) 1.0533541
Kurtosis -1.8638093
Mean 23.175013
Median Absolute Deviation (MAD) 8
Skewness 0.21016025
Sum 4623925
Variance 595.92105
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value  Count  Frequency (%)
0  95982  48.1%
52  70314  35.2%
40  2790  1.4%
50  2304  1.2%
26  2268  1.1%
48  1806  0.9%
12  1780  0.9%
30  1378  0.7%
20  1330  0.7%
8  1126  0.6%
Other values (43)  18444  9.2%

Minimum values
Value  Count  Frequency (%)
0  95982  48.1%
1  464  0.2%
2  458  0.2%
3  417  0.2%
4  757  0.4%
5  309  0.2%
6  646  0.3%
7  152  0.1%
8  1126  0.6%
9  239  0.1%

Maximum values
Value  Count  Frequency (%)
52  70314  35.2%
51  819  0.4%
50  2304  1.2%
49  509  0.3%
48  1806  0.9%
47  278  0.1%
46  708  0.4%
45  669  0.3%
44  845  0.4%
43  374  0.2%

income
Categorical

IMBALANCE 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.5 MiB

0  187140
1  12382

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 199522
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Common Values

Value  Count  Frequency (%)
0  187140  93.8%
1  12382  6.2%
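
The IMBALANCE flag on this variable is consistent with a score of one minus the normalized Shannon entropy of the class shares. The sketch below assumes that definition; it is inferred from the reported numbers rather than quoted from the ydata-profiling source, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    Assumed definition; not taken from the ydata-profiling source code.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # A single class leaves nothing to compare in this sketch.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

income = [187140, 12382]        # classes 0 and 1, from the Common Values table
own_bus = [197538, 1593, 391]   # Not in universe, No, Yes

print(f"income:  {100 * imbalance_score(income):.1f}%")
print(f"own_bus: {100 * imbalance_score(own_bus):.1f}%")
```

With these counts the sketch gives about 66.4% for income and about 94.5% for own_bus, in line with the imbalance alerts listed in the report overview.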

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
Value  Count  Frequency (%)
0  187140  93.8%
1  12382  6.2%

Most occurring characters

Value  Count  Frequency (%)
0  187140  93.8%
1  12382  6.2%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  199522  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  187140  93.8%
1  12382  6.2%

Most occurring scripts

Value  Count  Frequency (%)
Common  199522  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  187140  93.8%
1  12382  6.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  199522  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  187140  93.8%
1  12382  6.2%

Interactions

[Figure: interaction plots for pairs of numeric variables (100 panels)]

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

age class_of_worker industry_code occupation_code education wage_per_hour edu_inst marital mace hispanic sex labor_union reason_unemployment employment_type gains losses divdends liability state_residence household_summary instance_weight migration_msa migration_reg migration_within live_one_year sunbelt person_worked under18 citizen person_income own_bus week_workd income
058Self-employed-not incorporated434Some college but no degree0Not in universeDivorcedWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000Head of householdArkansasHouseholder1053.55MSA to MSASame countySame countyNoYes1Not in universeNative- Born in the United States0Not in universe520
118Not in universe0010th grade0High schoolNever marriedAsian or Pacific IslanderAll otherFemaleNot in universeNot in universeNot in labor force000NonfilerNot in universeChild 18 or older991.95???Not in universe under 1 year old?0Not in universeForeign born- Not a citizen of U S0Not in universe00
29Not in universe00Children0Not in universeNever marriedWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married1758.14NonmoverNonmoverNonmoverYesNot in universe0Both parents presentNative- Born in the United States0Not in universe00
310Not in universe00Children0Not in universeNever marriedWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married1069.16NonmoverNonmoverNonmoverYesNot in universe0Both parents presentNative- Born in the United States0Not in universe00
448Private4010Some college but no degree1200Not in universeMarried-civilian spouse presentAmer Indian Aleut or EskimoAll otherFemaleNoNot in universeFull-time schedules000Joint both under 65Not in universeSpouse of householder162.61???Not in universe under 1 year old?1Not in universeNative- Born in the United States2Not in universe520
542Private343Bachelors degree(BA AB BS)0Not in universeMarried-civilian spouse presentWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces517800Joint both under 65Not in universeHouseholder1535.86NonmoverNonmoverNonmoverYesNot in universe6Not in universeNative- Born in the United States0Not in universe520
628Private440High school graduate0Not in universeNever marriedWhiteAll otherFemaleNot in universeJob loser - on layoffUnemployed full-time000SingleNot in universeNonrelative of householder898.83???Not in universe under 1 year old?4Not in universeNative- Born in the United States0Not in universe300
747Local government4326Some college but no degree876Not in universeMarried-civilian spouse presentWhiteAll otherFemaleNoNot in universeFull-time schedules000Joint both under 65Not in universeSpouse of householder1661.53???Not in universe under 1 year old?5Not in universeNative- Born in the United States0Not in universe520
834Private437Some college but no degree0Not in universeMarried-civilian spouse presentWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000Joint both under 65Not in universeHouseholder1146.79NonmoverNonmoverNonmoverYesNot in universe6Not in universeNative- Born in the United States0Not in universe520
98Not in universe00Children0Not in universeNever marriedWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married2466.24NonmoverNonmoverNonmoverYesNot in universe0Both parents presentNative- Born in the United States0Not in universe00
age class_of_worker industry_code occupation_code education wage_per_hour edu_inst marital mace hispanic sex labor_union reason_unemployment employment_type gains losses divdends liability state_residence household_summary instance_weight migration_msa migration_reg migration_within live_one_year sunbelt person_worked under18 citizen person_income own_bus week_workd income
19951257Private9379th grade0Not in universeDivorcedWhiteCentral or South AmericanFemaleNot in universeNot in universeFull-time schedules000SingleNot in universeHouseholder743.66???Not in universe under 1 year old?4Not in universeForeign born- Not a citizen of U S0Not in universe520
19951351Private331910th grade0Not in universeWidowedWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000SingleNorth DakotaHouseholder1302.34NonMSA to nonMSASame countySame countyNoYes6Not in universeNative- Born in the United States0Not in universe520
19951487Not in universe00High school graduate0Not in universeWidowedWhiteAll otherFemaleNot in universeNot in universeNot in labor force000SingleNot in universeHouseholder3255.80???Not in universe under 1 year old?0Not in universeNative- Born in the United States0Not in universe00
1995153Not in universe00Children0Not in universeNever marriedBlackAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerUtahNonrelative of householder2733.75MSA to MSASame countySame countyNoYes0Mother only presentNative- Born in the United States0Not in universe00
19951639Private4326Bachelors degree(BA AB BS)0Not in universeNever marriedOtherMexican-AmericanMaleNoNot in universeFull-time schedules684900SingleNot in universeHouseholder908.14???Not in universe under 1 year old?6Not in universeForeign born- Not a citizen of U S2Not in universe520
19951787Not in universe007th and 8th grade0Not in universeMarried-civilian spouse presentWhiteAll otherMaleNot in universeNot in universeNot in labor force000Joint both 65+Not in universeHouseholder955.27???Not in universe under 1 year old?0Not in universeNative- Born in the United States0Not in universe00
19951865Self-employed-incorporated37211th grade0Not in universeMarried-civilian spouse presentWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces641809Joint one under 65 & one 65+Not in universeHouseholder687.19NonmoverNonmoverNonmoverYesNot in universe1Not in universeNative- Born in the United States0Not in universe520
19951947Not in universe00Some college but no degree0Not in universeMarried-civilian spouse presentWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces00157Joint both under 65Not in universeHouseholder1923.03???Not in universe under 1 year old?6Not in universeForeign born- U S citizen by naturalization0Not in universe520
19952016Not in universe0010th grade0High schoolNever marriedWhiteAll otherFemaleNot in universeNot in universeNot in labor force000NonfilerNot in universeChild under 18 never married4664.87???Not in universe under 1 year old?0Both parents presentNative- Born in the United States0Not in universe00
19952132Private4230High school graduate0Not in universeNever marriedBlackAll otherFemaleNoNot in universeChildren or Armed Forces000SingleNot in universeHouseholder1830.11NonmoverNonmoverNonmoverYesNot in universe6Not in universeForeign born- Not a citizen of U S0Not in universe520

Duplicate rows

Most frequently occurring

age class_of_worker industry_code occupation_code education wage_per_hour edu_inst marital mace hispanic sex labor_union reason_unemployment employment_type gains losses divdends liability state_residence household_summary instance_weight migration_msa migration_reg migration_within live_one_year sunbelt person_worked under18 citizen person_income own_bus week_workd income # duplicates
1180Not in universe00Children0Not in universeNever marriedWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married1363.88???Not in universe under 1 year old?0Both parents presentNative- Born in the United States0Not in universe006
1200Not in universe00Children0Not in universeNever marriedWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married1366.71???Not in universe under 1 year old?0Both parents presentNative- Born in the United States0Not in universe006
6543Not in universe00Children0Not in universeNever marriedWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married2125.99???Not in universe under 1 year old?0Both parents presentNative- Born in the United States0Not in universe006
203810Not in universe00Children0Not in universeNever marriedWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married1185.19???Not in universe under 1 year old?0Both parents presentNative- Born in the United States0Not in universe006
224311Not in universe00Children0Not in universeNever marriedWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married1131.62NonmoverNonmoverNonmoverYesNot in universe0Both parents presentNative- Born in the United States0Not in universe006
270813Not in universe00Children0Not in universeNever marriedWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married981.79NonmoverNonmoverNonmoverYesNot in universe0Both parents presentNative- Born in the United States0Not in universe006
271013Not in universe00Children0Not in universeNever marriedWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married1013.75NonmoverNonmoverNonmoverYesNot in universe0Both parents presentNative- Born in the United States0Not in universe006
3041Not in universe00Children0Not in universeNever marriedWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married1175.86???Not in universe under 1 year old?0Both parents presentNative- Born in the United States0Not in universe005
4082Not in universe00Children0Not in universeNever marriedWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married933.97???Not in universe under 1 year old?0Both parents presentNative- Born in the United States0Not in universe005
4272Not in universe00Children0Not in universeNever marriedWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeChild under 18 never married1182.42???Not in universe under 1 year old?0Both parents presentNative- Born in the United States0Not in universe005
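
A "most frequently occurring" duplicates listing like the one above can be approximated by counting identical records and keeping those that occur more than once. The following is a minimal sketch using only the Python standard library. It is not necessarily how ydata-profiling computes this table; the function name `most_frequent_duplicates` and the toy records are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def most_frequent_duplicates(rows, n=10):
    """Return up to n (row, count) pairs for rows that occur more than once."""
    # Convert each record to a tuple so it is hashable and can be counted.
    counts = Counter(tuple(row) for row in rows)
    duplicated = [(row, c) for row, c in counts.items() if c > 1]
    # Sort by count, highest first; the sort is stable, so ties keep first-seen order.
    duplicated.sort(key=lambda item: item[1], reverse=True)
    return duplicated[:n]


# Hypothetical toy records (age, education, sex), for illustration only.
sample = [
    (0, "Children", "Male"),
    (3, "Children", "Female"),
    (0, "Children", "Male"),
    (13, "Children", "Male"),
    (3, "Children", "Female"),
    (0, "Children", "Male"),
]

top = most_frequent_duplicates(sample)
for row, count in top:
    print(row, count)
```

The count here is the total number of occurrences of each repeated record, which is assumed to correspond to the "# duplicates" column.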